chainCleaner improves genome alignment specificity and sensitivity
نویسندگان
چکیده
Motivation Accurate alignments between entire genomes are crucial for comparative genomics. However, computing sensitive and accurate genome alignments is a challenging problem, complicated by genomic rearrangements. Results Here we present a fast approach, called chainCleaner, that improves the specificity in genome alignments by accurately detecting and removing local alignments that obscure the evolutionary history of genomic rearrangements. Systematic tests on alignments between the human and other vertebrate genomes show that chainCleaner (i) improves the alignment of numerous orthologous genes, (ii) exposes alignments between exons of orthologous genes that were masked before by alignments to pseudogenes, and (iii) recovers hundreds of kilobases in local alignments that otherwise would fall below a minimum score threshold. Our approach has broad applicability to improve the sensitivity and specificity of genome alignments. Availability and Implementation http://bds.mpi-cbg.de/hillerlab/chainCleaner/ or https://github.com/ucscGenomeBrowser/kent. Contact [email protected]. Supplementary information Supplementary data are available at Bioinformatics online.
منابع مشابه
Evaluation of First and Second Markov Chains Sensitivity and Specificity as Statistical Approach for Prediction of Sequences of Genes in Virus Double Strand DNA Genomes
Growing amount of information on biological sequences has made application of statistical approaches necessary for modeling and estimation of their functions. In this paper, sensitivity and specificity of the first and second Markov chains for prediction of genes was evaluated using the complete double stranded DNA virus. There were two approaches for prediction of each Markov Model parameter,...
متن کاملIncreased alignment sensitivity improves the usage of genome alignments for comparative gene annotation
Genome alignments provide a powerful basis to transfer gene annotations from a well-annotated reference genome to many other aligned genomes. The completeness of these annotations crucially depends on the sensitivity of the underlying genome alignment. Here, we investigated the impact of the genome alignment parameters and found that parameters with a higher sensitivity allow the detection of t...
متن کاملGenome analysis Population-based structural variation discovery with Hydra-Multi
Summary: Current strategies for SNP and INDEL discovery incorporate sequence alignments from multiple individuals to maximize sensitivity and specificity. It is widely accepted that this approach also improves structural variant (SV) detection. However, multisample SV analysis has been stymied by the fundamental difficulties of SV calling, e.g. library insert size variability, SV alignment sign...
متن کاملPopulation-based structural variation discovery with Hydra-Multi
UNLABELLED Current strategies for SNP and INDEL discovery incorporate sequence alignments from multiple individuals to maximize sensitivity and specificity. It is widely accepted that this approach also improves structural variant (SV) detection. However, multisample SV analysis has been stymied by the fundamental difficulties of SV calling, e.g. library insert size variability, SV alignment si...
متن کاملgpALIGNER: A Fast Algorithm for Global Pairwise Alignment of DNA Sequences
Bioinformatics, through the sequencing of the full genomes for many species, is increasingly relying on efficient global alignment tools exhibiting both high sensitivity and specificity. Many computational algorithms have been applied for solving the sequence alignment problem. Dynamic programming, statistical methods, approximation and heuristic algorithms are the most common methods appli...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Bioinformatics
دوره 33 11 شماره
صفحات -
تاریخ انتشار 2017